Visual methods for exploring multivariate spatio-temporal networks with application to health transport

Confirmation Report

Author

Krisanat Anukarnsakulchularp

Background

Analysing spatio-temporal network data is a contemporary research problem that has gained increasing interest in the health field, particularly within emergency medical services (EMS) and ambulance transfer systems. Such data capture spatial, temporal, and often multivariate information. The spatial component generally represents geographic locations or spatial geometries, while the temporal component records time-related information through timestamps or time intervals (Rao, Govardhan, and Rao 2012). In addition, the underlying network structure creates connections and multivariate dependencies between locations and transfers. While techniques exist to analyse the spatial and temporal components separately, analysing and, perhaps more importantly, exploring these components in conjunction with the network structure remains an open challenge.

Older individuals often require continuous support, including 24-hour care, assistance with daily tasks, and ongoing medical supervision. Thus, many reside in residential aged care facilities (RACFs), which are specifically designed to provide this comprehensive care (Kearney and Winterbottom 2006). RACFs frequently rely on ambulance services to transfer residents to hospital for both acute emergencies and planned or scheduled medical appointments. The number of transfers is rising, partly due to population ageing (Harris and Sharma 2018), which puts considerable pressure on emergency medical services, where delays could increase health risks (Harmsen et al. 2015). During the COVID-19 pandemic, lockdown measures and movement restrictions further disrupted the delivery of emergency services. The effects of lockdowns and rising transfer demand highlight the need for further analysis to improve the planning and utilisation of ambulance services.

To gain insight into transfer patterns, data exploration using network representations linking RACFs and hospitals provides a powerful framework. However, most network research focuses primarily on topological properties, often treating the network homogeneously and overlooking other important information, such as the association between variables (Cardenas et al. 2021; Fernández-Gracia et al. 2017). While a network representation is suited to transfer data, overemphasising network topology can neglect the fundamental principles of data exploration. These limitations arise from the practical challenges of working with spatio-temporal network data, including cleaning the data (particularly the temporal information), wrangling and subsetting it with ease, and visualising and drawing inferences from it. As a result, simple but informative analyses, such as examining variable distributions, temporal trends, or bivariate relationships, are often underutilised, despite their ability to reveal key insights into the data. This underlines the need for an infrastructure that integrates network-based approaches with exploratory data analysis (EDA), enabling a comprehensive exploration of spatio-temporal transfer networks.

Studying how infectious diseases spread through the network of transfers between RACFs and hospitals is important because older people tend to face a higher risk of mortality during outbreaks (Parohan et al. 2020). Patient transfers between facilities create pathways for a disease to be transmitted across the system, leading to rapid spread. Traditional compartmental infectious disease models, which assume a homogeneous or static structure, do not adequately capture networks that change over time. In reality, ambulance transfers are highly dynamic, and the connections between facilities can change in response to demand, constraints, and even outbreak conditions. Understanding these transmission dynamics is therefore crucial for devising effective policies to limit spread and for identifying high-risk facilities, critical transfer connections, and periods of high exposure.

Project 1: Developing Infrastructure for Exploratory Analysis of Multivariate Spatio-temporal Network with Application to Ambulance Transfers

Part A: Exploratory Data Analysis Infrastructure for Multivariate Spatio-temporal Network

As multivariate spatio-temporal network data become more accessible and complex, understanding their structure and dynamics is key to effective decision-making. As mentioned in Section 1, a major challenge in analysing large multivariate networks lies in the sheer amount of information they contain, most of which is often overlooked. This infrastructure aims to support the exploration of multivariate spatio-temporal network data. Exploratory data analysis involves several key processes: data storage, cleaning, subsetting, and visualisation. The following sections therefore review existing tools that support these processes and discuss their limitations.

Data Storage and Cleaning

Data cleaning is the first stage of a reliable analysis. Spatio-temporal data usually need to be checked for inconsistent temporal records, duplicated records, and spatial inaccuracies. Adding a network structure on top of that, with nodes, edges, and their attributes, requires the network topology to be preserved throughout the process. Typically, this stage involves tools such as dplyr (Wickham et al. 2023) for manipulating the data, tsibble (Wang, Cook, and Hyndman 2020) for checking temporal consistency, sf (Pebesma 2018) for checking coordinate accuracy, and igraph/network (Csárdi et al. 2026; Butts 2008) for maintaining the network structure.

The tidygraph (Pedersen 2024b) package provides a tidy API for graph and network manipulation, where network data is thought of as two tidy tables, one for node and one for edge data. In tidy data (Wickham 2014), each variable has its own column, each observation has its own row, and each value has its own cell. These tables are then stored together within a tbl_graph object, which preserves the underlying network topology while allowing standard dplyr verbs to be applied. The interaction between node and edge tables is done through the use of a special function, activate(), which allows the user to switch between the two tables and apply dplyr operations such as mutate(), group_by(), and join operations.

There are two main functions for creating a tbl_graph object: as_tbl_graph() and tbl_graph(). The first, as_tbl_graph(), accepts objects of several classes, such as data.frame, igraph, and network, and converts them into a tbl_graph object. The second, tbl_graph(), takes two data.frame objects, one for the nodes and one for the edges.

The difference between the two is that as_tbl_graph() only needs the edge dataset, so all the multivariate information sits in the edge data and the node data contains only the name (location). With tbl_graph(), the node table is supplied explicitly, which is useful when the nodes have attributes of their own.

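# Build from an edge table only; the node table is inferred from the edge endpoints
graph <- as_tbl_graph(edges)

# Build from explicit node and edge tables, keeping node attributes
graph <- tbl_graph(nodes = nodes, edges = edges)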

graph |>
  activate(edges) |>
  mutate(year = lubridate::year(casedate))
# A tbl_graph: 815 nodes and 102073 edges
#
# A directed acyclic multigraph with 6 components
#
# Edge Data: 102,073 × 9 (active)
    from    to casedate     age gender diagnosis         daytype single_id  year
   <int> <int> <date>     <dbl> <chr>  <chr>             <chr>       <dbl> <dbl>
 1   575   659 2019-04-29    96 Female OTHER - SPECIFY   weekda…  10097915  2019
 2   514   628 2022-05-31    88 Female SHORT OF BREATH   weekda…  13485226  2022
 3   522   628 2020-10-03    90 Male   LACERATION        weeken…  11574122  2020
 4   562   633 2020-12-18    91 Female PAIN              weekda…  11813591  2020
 5   562   640 2018-02-05    89 Female SEPSIS            weekda…   8895777  2018
 6   562   633 2021-03-31    92 Female PAIN              weekda…  12134872  2021
 7   562   640 2021-04-22    92 Female SHORT OF BREATH   weekda…  12204603  2021
 8   562   633 2019-07-20    97 Female URINARY TRACT IN… weeken…  10340046  2019
 9   516   706 2021-01-13    98 Female NO PROBLEM IDENT… weekda…  11895939  2021
10   596   640 2020-07-04    90 Male   ALTERED CONSCIOU… weeken…  11319542  2020
# ℹ 102,063 more rows
#
# Node Data: 815 × 4
  name                        longitude latitude type 
  <chr>                           <dbl>    <dbl> <chr>
1 1 ABERDEEN STREET RESERVOIR      145.    -37.7 racf 
2 1 ADENEY STREET CAMPERDOWN       143.    -38.2 racf 
3 1 AITKEN AVENUE DONALD           143.    -36.4 racf 
# ℹ 812 more rows
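
As a minimal sketch of the difference between the two constructors described above, the toy tables below build the same three transfers both ways. The facility names and values are hypothetical and are not taken from the Alfred Health data. With as_tbl_graph() the node table only carries name, whereas tbl_graph() keeps the extra type column supplied in the node table.

```r
library(dplyr)
library(tidygraph)

# Hypothetical toy data: three transfers between made-up facilities
edges <- tibble(
  from = c("racf_a", "racf_a", "racf_b"),
  to   = c("hosp_x", "hosp_y", "hosp_x"),
  age  = c(90, 85, 77)
)
nodes <- tibble(
  name = c("racf_a", "racf_b", "hosp_x", "hosp_y"),
  type = c("racf", "racf", "hospital", "hospital")
)

# Edge table only: nodes are inferred from the from/to columns
g_edges_only <- as_tbl_graph(edges)

# Explicit node table: character from/to values are matched against `name`
g_with_nodes <- tbl_graph(nodes = nodes, edges = edges)

# Compare which columns each node table ends up with
cols_edges_only <- g_edges_only |> activate(nodes) |> as_tibble() |> names()
cols_with_nodes <- g_with_nodes |> activate(nodes) |> as_tibble() |> names()
print(cols_edges_only)
print(cols_with_nodes)
```

In both cases the edge attributes (here age) stay on the edge table; only the node table differs.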

For spatial networks, the sfnetworks package (van der Meer et al. 2024) extends tidygraph by allowing spatial geometries to be incorporated directly within the tbl_graph object. It is useful for dealing with complex geometries where edges are not straight-line connections, such as road or transport networks. The package also allows standard spatial operations from the sf package to be performed within the network context.

However, the temporal data structure provided by tsibble is not directly compatible with tidygraph objects. As a result, validating temporal consistency requires converting the data back to a tsibble object or performing the temporal check before the tbl_graph is created. This introduces an important limitation: common operations such as filling in missing observations are done outside the network context and therefore do not preserve the network topology. For example, if a node is missing in January 2020, how should the edges associated with that node be imputed? One possible solution is to assume that no edges exist during that period, which is reasonable in some cases but not in all. This highlights a key challenge in cleaning spatio-temporal network data: temporal consistency and network structure should be considered jointly. Careful methodological decisions are required to ensure that both the temporal attributes and the relational structure of the network remain coherent throughout the cleaning process.
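
The workaround described above, checking temporal consistency before the network is built, might look like the following sketch. It uses hypothetical toy edges, aggregates them to monthly counts per RACF-hospital pair, and uses tsibble to flag and fill the gaps. Filling with zero encodes the assumption that a missing month means no transfers, which, as noted above, is not appropriate in every case.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy edges: one row per transfer
edges <- tibble(
  from = c("racf_a", "racf_a", "racf_a", "racf_a", "racf_b", "racf_b"),
  to   = "hosp_x",
  casedate = as.Date(c("2020-01-10", "2020-01-20", "2020-02-03",
                       "2020-04-15", "2020-02-11", "2020-03-02"))
)

# Monthly transfer counts, keyed by the (from, to) pair
monthly <- edges |>
  mutate(month = yearmonth(casedate)) |>
  count(from, to, month, name = "n_transfers") |>
  as_tsibble(key = c(from, to), index = month)

# Which pairs have implicit gaps in their monthly series?
gaps <- has_gaps(monthly)
print(gaps)

# Assumption: a missing month means zero transfers for that pair
filled <- fill_gaps(monthly, n_transfers = 0L)
print(filled)
```

Here racf_a has no transfer recorded in March 2020, so that month is added with a count of zero, while the racf_b series is left unchanged. The filled table would then still need to be turned back into edges before creating the tbl_graph, which is the disconnect the paragraph above points out.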

Data Subsetting

Data subsetting extracts a subset of spatio-temporal network data based on spatial, temporal, and multivariate variables. This includes grouping data by time periods or regions, as well as filtering based on variable values and network characteristics (e.g., in-degree). In a network context, filtering operations need to account for topological dependencies between nodes and edges. When nodes are removed based on a condition, all edges incident to those nodes are also deleted (Figure 1). In contrast, when edges are removed, the nodes connected to those edges are preserved, since nodes can exist independently of edges (Figure 2). The tidygraph package supports these subsetting operations through dplyr functions such as filter() and select(), which are applied separately to nodes and edges while keeping the underlying network consistent. As with data manipulation, users need to switch between the node and edge tables to subset based on their attributes.

Figure 1: Node filtering. (a) Full network; (b) node to remove; (c) filtered network.
Figure 2: Edge filtering. (a) Full network; (b) edges to remove; (c) filtered network.

graph |>
  activate(edges) |>
  filter(between(year, 2020, 2021))
# A tbl_graph: 815 nodes and 48606 edges
#
# A directed acyclic multigraph with 59 components
#
# Edge Data: 48,606 × 9 (active)
    from    to casedate     age gender diagnosis         daytype single_id  year
   <int> <int> <date>     <dbl> <chr>  <chr>             <chr>       <dbl> <dbl>
 1   522   628 2020-10-03    90 Male   LACERATION        weeken…  11574122  2020
 2   562   633 2020-12-18    91 Female PAIN              weekda…  11813591  2020
 3   562   633 2021-03-31    92 Female PAIN              weekda…  12134872  2021
 4   562   640 2021-04-22    92 Female SHORT OF BREATH   weekda…  12204603  2021
 5   516   706 2021-01-13    98 Female NO PROBLEM IDENT… weekda…  11895939  2021
 6   596   640 2020-07-04    90 Male   ALTERED CONSCIOU… weeken…  11319542  2020
 7   279   769 2020-05-16    19 Male   POST ICTAL        weeken…  11183777  2020
 8   231   645 2020-08-06    46 Male   OTHER - SPECIFY   weekda…  11422186  2020
 9   231   645 2020-08-07    46 Male   OTHER - SPECIFY   weekda…  11415518  2020
10   265   657 2021-05-05    45 Male   NO PROBLEM IDENT… weekda…  12244819  2021
# ℹ 48,596 more rows
#
# Node Data: 815 × 4
  name                        longitude latitude type 
  <chr>                           <dbl>    <dbl> <chr>
1 1 ABERDEEN STREET RESERVOIR      145.    -37.7 racf 
2 1 ADENEY STREET CAMPERDOWN       143.    -38.2 racf 
3 1 AITKEN AVENUE DONALD           143.    -36.4 racf 
# ℹ 812 more rows
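
The asymmetry between node and edge filtering can be checked on a small graph. The sketch below uses hypothetical toy data: removing a node also drops its incident edges, while removing edges leaves every node in place. The size_of() helper is introduced here for illustration only.

```r
library(dplyr)
library(tidygraph)

# Hypothetical toy network: two RACFs sending transfers to one hospital
nodes <- tibble(
  name = c("racf_a", "racf_b", "hosp_x"),
  type = c("racf", "racf", "hospital")
)
edges <- tibble(
  from = c(1L, 2L, 2L),
  to   = c(3L, 3L, 3L),
  year = c(2019, 2020, 2021)
)
g <- tbl_graph(nodes = nodes, edges = edges, directed = TRUE)

# Illustrative helper: number of nodes and edges in a graph
size_of <- function(graph) {
  c(nodes = igraph::vcount(graph), edges = igraph::ecount(graph))
}

# Node filtering: dropping racf_b also removes its two outgoing edges
node_filtered <- g |>
  activate(nodes) |>
  filter(name != "racf_b")

# Edge filtering: dropping the 2019 edge keeps all three nodes
edge_filtered <- g |>
  activate(edges) |>
  filter(year >= 2020)

print(size_of(g))
print(size_of(node_filtered))
print(size_of(edge_filtered))
```

After the edge filter, racf_a has no remaining edges but is still present in the node table.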

Network Sampling

Another important aspect of subsetting is understanding how sampling methods perform on network data. Real-world datasets are often not evenly distributed across dimensions such as time, space, or variable groups. Some strata may contain more observations than others, and analysing such data directly can distort the interpretation, as the larger strata may dominate the patterns seen. Sampling provides a way to subset the data while keeping it representative of the population. Stratified sampling, in particular, helps with imbalanced data by dividing the data into subgroups and sampling within each group, ensuring that all groups are represented in the sampled data.

In the network context, sampling methods are generally categorised into the following (Chuong Nguyen 2025):

  • Node-based sampling selects a subset of nodes from the network and retains the edges that are incident to the sampled nodes. This method is efficient and is usually implemented in large-scale studies (Ben-Eliezer et al. 2022). However, it often fails to capture important global structural properties such as connectivity and clustering.

  • Edge-based sampling samples a subset of edges directly and includes the nodes incident to those edges. This method is better at preserving structural patterns (Jiao 2024). However, it may introduce bias towards selecting nodes with higher degrees.

There are many additional sampling methods. Hu and Lau (2013) provide a comprehensive survey and taxonomy of graph sampling approaches; these are outside the scope of this project.

The tidygraph package provides a method for sampling a tbl_graph object through the sample_n() function, although it is now recommended to use slice_sample() instead. A further limitation of tbl_graph is that it does not directly support stratified (i.e., group_by) sampling. Instead, the tbl_graph object needs to be converted back to a tibble (Müller and Wickham 2025), stratified sampling performed on the node or edge table, and the original network then filtered based on the sampled nodes or edges. This limitation shows that sampling operations for network objects can still be improved.

set.seed(1)

# Edges sampling
graph |> 
  activate(edges) |> 
  sample_n(size = 20)

# Stratified edges sampling
edges_kept <- graph |> 
  activate(edges) |> 
  as_tibble() |> 
  group_by(daytype) |> 
  sample_n(size = 10) |> 
  pull(single_id)

graph |> 
  activate(edges) |> 
  filter(single_id %in% edges_kept) |> 
  activate(nodes) |> 
  filter(!node_is_isolated())
# A tbl_graph: 34 nodes and 20 edges
#
# A directed acyclic multigraph with 15 components
#
# Node Data: 34 × 4 (active)
   name                               longitude latitude type 
   <chr>                                  <dbl>    <dbl> <chr>
 1 101 PUNT ROAD WINDSOR                   145.    -37.9 racf 
 2 12 HAVELOCK STREET DUNOLLY              144.    -36.9 racf 
 3 16 HOPETOUN ROAD WARRNAMBOOL            142.    -38.4 racf 
 4 17 EGGINTON STREET BRUNSWICK WEST       145.    -37.8 racf 
 5 177-179 TINDALS ROAD DONVALE            145.    -37.8 racf 
 6 18 VILLA ROAD SPRINGVALE                145.    -37.9 racf 
 7 2 NICOL AVENUE BURNSIDE                 145.    -37.8 racf 
 8 203 NAPIER STREET SOUTH MELBOURNE       145.    -37.8 racf 
 9 205 WARRANDYTE ROAD RINGWOOD NORTH      145.    -37.8 racf 
10 23 FOREST DRIVE FRANKSTON NORTH         145.    -38.1 racf 
# ℹ 24 more rows
#
# Edge Data: 20 × 9
   from    to casedate     age gender diagnosis          daytype single_id  year
  <int> <int> <date>     <dbl> <chr>  <chr>              <chr>       <dbl> <dbl>
1    14    28 2020-07-04    96 Male   AVULSION           weeken…  11320809  2020
2    15    27 2021-04-13    66 Female CONFUSION          weekda…  12176903  2021
3     8    21 2019-02-15    87 Male   WOUND INFLAMMATIO… weekda…   9869916  2019
# ℹ 17 more rows

As discussed in Section 2.1.2, nodes in a network can exist without incident edges. Thus, edge-based sampling does not automatically remove nodes that become isolated after sampling. These nodes must be removed explicitly by filtering the node table using the node_is_isolated() function.
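
The same stratified workflow can be sketched with dplyr's slice_sample(), which supersedes sample_n(). The example below runs on hypothetical toy data; the edge_id column is an assumed unique edge identifier, playing the role of single_id above. Sampling is done on the plain edge tibble, and the graph is then filtered to the sampled edges and cleaned of isolated nodes.

```r
library(dplyr)
library(tidygraph)

set.seed(1)

# Hypothetical toy network: three RACFs, one hospital, six transfers
nodes <- tibble(name = c("racf_a", "racf_b", "racf_c", "hosp_x"))
edges <- tibble(
  from    = c(1L, 1L, 1L, 2L, 2L, 3L),
  to      = 4L,
  daytype = c("weekday", "weekday", "weekday", "weekday", "weekend", "weekend"),
  edge_id = 1:6
)
g <- tbl_graph(nodes = nodes, edges = edges)

# Stratified sampling on the edge tibble: two edges per daytype
edges_kept <- g |>
  activate(edges) |>
  as_tibble() |>
  group_by(daytype) |>
  slice_sample(n = 2) |>
  pull(edge_id)

# Filter the graph to the sampled edges, then drop isolated nodes
sampled <- g |>
  activate(edges) |>
  filter(edge_id %in% edges_kept) |>
  activate(nodes) |>
  filter(!node_is_isolated())

sampled_edges <- sampled |> activate(edges) |> as_tibble()
print(count(sampled_edges, daytype))
```

Each daytype stratum contributes exactly two edges, regardless of how many edges it had originally.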

Data Visualisation

Data visualisation helps reveal patterns, anomalies and relationships that may not be apparent from numerical summaries alone. Network data is often viewed as connections or flows between nodes/locations, and network-based visualisation allows for easier communication to a broader audience. For a simple network without spatial coordinates, placing nodes and edges in a visualisation requires the use of a graph layout algorithm, such as the Kamada-Kawai layout (Kamada and Kawai 1989). Depending on the chosen algorithm, the positions of nodes and edges can be different even on the same network dataset. With spatial information, visualising these becomes more straightforward, as longitude and latitude can be used to specify the actual location of the nodes, with edges represented as lines connecting these locations.

simple_graph |> 
  ggraph(x = long, y = lat) +
  geom_sf(data = vic_map, color = "white") +
  geom_edge_link(alpha = 0.1) +
  geom_node_point(aes(color = category)) +
  scale_color_brewer(name = "Facility", palette = "Set1")
Figure 3: An ambulance transfer network in Victoria between residential aged care facilities and hospitals.

Visualising high-dimensional network data can be challenging, especially through static visualisation alone. The current tool for network visualisation in R is the ggraph package (Pedersen 2024a), which extends the ggplot2 package (Wickham 2016) to support relational data structures such as networks, graphs, and trees. The ggraph package is effective at visualising static networks, offering a range of layout algorithms for placing nodes while keeping the familiar ggplot2 syntax. However, its support for interactive network visualisation is currently limited. Static network visualisation is hard because the amount of information that can be mapped within a single figure is limited. As shown in Figure 3, even a simple network representation can quickly become cluttered. Answering detailed questions, such as the number of transfers between a specific RACF and hospital or the name of a particular RACF, is difficult using static visualisation alone. Interactive visualisation helps with these limitations by layering additional information onto the visualisation, allowing for further exploration.

interactive_vis_node <- simple_graph |> 
  mutate(name = str_remove(name, "'")) |> 
  ggraph(x = long, y = lat) +
  geom_sf(data = vic_map, color = "white") +
  geom_edge_link(alpha = 0.1) +
  geom_point_interactive(aes(x = x, 
                             y = y,
                             color = category,
                             tooltip = name,
                             data_id = name)) +
  scale_color_brewer(name = "Facility", palette = "Set1")

girafe(ggobj = interactive_vis_node,
       options = list(
         opts_hover(css = "fill:lightblue;stroke:grey;stroke-width:0.5px"),
         opts_zoom(min = 0.5, max = 3)
       ))
Figure 4: An ambulance transfer network in Victoria between residential aged care facilities and hospitals with interactive nodes.
interactive_vis_edge <- simple_graph |> 
  ggraph(x = long, y = lat) +
  geom_sf(data = vic_map, color = "white") +
  geom_node_point(aes(color = category)) +
  geom_segment_interactive(
    data = simple_graph |> activate(edges) |> as_tibble() |> mutate(id = row_number()),
    alpha = 0.2,
    aes(x = long_racf,
        y = lat_racf,
        xend = long_hosp,
        yend = lat_hosp,
        tooltip = weight,
        data_id = id)) +
  scale_color_brewer(name = "Facility", palette = "Set1")
  

girafe(ggobj = interactive_vis_edge,
       options = list(
         opts_hover(css = "fill:lightblue;stroke:grey;stroke-width:0.5px"),
         opts_zoom(min = 0.5, max = 3)
       ))
Figure 5: An ambulance transfer network in Victoria between residential aged care facilities and hospitals with interactive edges.

The ggiraph package (Gohel and Skintzos 2025) adds interactive elements to visualisations as a ggplot2 extension. Creating interactive network visualisations, therefore, typically requires combining ggraph for layout construction with ggiraph for interactivity. Getting the two to work together requires insight into how ggraph works, and the integration is not seamless. For example, ggraph provides geom_node_* and geom_edge_* functions for node and edge geometries, respectively, but these do not natively support interactivity. To make nodes interactive, geom_point_interactive() needs to be used instead of geom_node_point(). Making edges interactive is more complex, as it requires geom_segment_interactive() and mapping the start and end coordinates (x, y, xend, yend) for each edge. At that point, if both nodes and edges need interactivity, ggraph is no longer needed: the nodes and edges can instead be treated as separate datasets and drawn with geom_point and geom_segment.
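
The ggraph-free alternative mentioned above can be sketched as follows. Nodes and edges are kept as two ordinary tables, the edge endpoints are looked up by joining on the node table, and the interactive point and segment geoms from ggiraph draw each layer. All names, coordinates, and weights below are hypothetical toy values, and no base map is drawn.

```r
library(dplyr)
library(ggplot2)
library(ggiraph)

# Hypothetical toy data with made-up coordinates
nodes <- tibble(
  name     = c("racf_a", "racf_b", "hosp_x"),
  long     = c(145.0, 144.3, 144.9),
  lat      = c(-37.8, -38.1, -37.7),
  category = c("racf", "racf", "hospital")
)
edges <- tibble(
  from   = c("racf_a", "racf_b"),
  to     = c("hosp_x", "hosp_x"),
  weight = c(12, 3)
)

# Attach start and end coordinates to every edge
coords <- select(nodes, name, long, lat)
edge_coords <- edges |>
  left_join(coords, by = c("from" = "name")) |>
  rename(long_from = long, lat_from = lat) |>
  left_join(coords, by = c("to" = "name")) |>
  rename(long_to = long, lat_to = lat) |>
  mutate(id = as.character(row_number()))

# Draw edges and nodes as two interactive layers
p <- ggplot() +
  geom_segment_interactive(
    data = edge_coords,
    aes(x = long_from, y = lat_from, xend = long_to, yend = lat_to,
        tooltip = weight, data_id = id),
    alpha = 0.4
  ) +
  geom_point_interactive(
    data = nodes,
    aes(x = long, y = lat, colour = category,
        tooltip = name, data_id = name)
  )

# Wrap the plot in an interactive widget
widget <- girafe(ggobj = p)
```

The trade-off is that the layout must come from the data (here, longitude and latitude); for a non-spatial network, node positions would first have to be computed by a layout algorithm and stored in the node table.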

Conclusion

Current tools such as tidygraph have several limitations that matter for exploratory data analysis. First, temporal data are not easy to clean within tidygraph, and there is no direct support for filtering temporal networks. As a result, issues such as missing timestamps and inconsistent time periods are currently handled outside the network context, which may affect the validity of the network. Second, support for data sampling methods is limited, with most methods working outside of the tidygraph object. Finally, visualisation capabilities are constrained by the ggraph package, as layout calculations are handled internally, making it difficult to access and reuse layout information. To address these limitations, a new data object is proposed that integrates better with existing packages such as tibble, tsibble, sf, dplyr, and ggplot2, enabling more effortless exploration of multivariate spatio-temporal networks.

Part B: An Exploratory Analysis of the COVID-19 Impact on Ambulance Efficiency and Transfer Volume

The efficiency of ambulance transfers is critical in both scheduled and emergency cases to minimise the resources used and the effect on patient outcomes. In scheduled transfers, the longer a transfer takes, the fewer ambulances are available in the system, which potentially limits the number of ambulances available during peak demand or emergencies. In emergency transfers, efficiency directly influences patient outcomes, especially for time-sensitive conditions such as trauma, cardiac emergencies, and stroke. Analysing ambulance transfer efficiency is essential for improving the quality of emergency care, ensuring an adequate number of available ambulances, and strengthening the system. In addition to efficiency, ambulance transfer volume is a key factor for better ambulance management and resource allocation (Chen et al. 2015).

During the COVID-19 pandemic, government-imposed lockdowns created significant challenges for ambulance services. Previous studies have shown that the pandemic affected the transfer of older individuals from residential aged care facilities to hospitals (Wyer et al. 2024; Nair et al. 2023; Botan et al. 2023). These restrictions limited ambulance availability and highlighted potential vulnerabilities in the reliance on ambulance transfers for aged care residents. Examining ambulance efficiency and transfer volumes during lockdown periods can inform strategies to improve system resilience and preparedness for future emergencies.

The ambulance transfer data were provided by Alfred Health. The dataset includes spatial, temporal, and multivariate information on each transfer. The spatial component covers the location of the aged care facility and the destination hospital in latitude and longitude. The temporal component provides the date of each transfer, covering the period between January 2018 and May 2022. The multivariate information covers hospital, aged care facility, and patient-level details.

Efficiency

Efficiency can be measured in many ways; one proxy is the distance from the aged care facility to the hospital. The Euclidean distance will be used to calculate the transfer distance. It is the shortest (straight-line) distance between two points, computed from their latitude and longitude. This measure should give a reasonable estimate at a lower computational cost than the actual road distance. Its limitation is that it ignores the actual street network, which does not follow a straight line; the true distance between an RACF and a hospital will therefore generally be longer.
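
The report does not state how the straight-line distance is converted to kilometres. One common option, shown here purely as an assumption and not as the method actually used, is the haversine great-circle formula, which works directly on longitude and latitude in degrees. The function name and the Earth radius of 6371 km are illustrative choices.

```r
# Great-circle (haversine) distance in kilometres between points given
# in decimal degrees. Assumption: a spherical Earth of radius 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# A co-located pair gives a distance of exactly zero
same_place <- haversine_km(145.0, -37.8, 145.0, -37.8)

# One degree of latitude is roughly 111 km
one_degree <- haversine_km(0, 0, 0, 1)

print(c(same_place = same_place, one_degree = one_degree))
```

The function is vectorised, so it can be applied to whole columns of RACF and hospital coordinates at once. Like any straight-line measure, it still underestimates the distance travelled by road.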

Figure 6: Histogram of an ambulance transfer distance for emergency and scheduled cases.

Figure 6 shows zero-distance transfers in the data, which suggests either a data entry error or that the aged care facility and hospital are next to each other (a co-located RACF). In the latter case, an ambulance may not be needed to transfer the patient. After filtering the data to zero-distance transfers and manually inspecting them, it can be seen that the aged care facility and hospital are, in fact, at the same location.

In an emergency, it is important that the transfer is done quickly (Section 1). This means that, most of the time, an RACF that is co-located with a hospital should not have patients transferred to another hospital. One consideration is that the co-located hospital might not be able to handle the emergency, in which case the patient should be transferred to the closest emergency-capable hospital. This can be verified through the hospital data, which has a column indicating whether the hospital has an emergency department.

Based on this information, efficiency can be improved by reducing transfers to hospitals that are further away, especially from co-located RACFs. To check this, the distance is calculated between all possible RACF-hospital combinations to identify which RACF is next to which hospital.

Co-located RACF
    n  dispatch
  388  APPOINTMENT - PRE BOOKED
  280  CLINICIAN-ACTIONED MEDIUM ACUITY EVENT
  259  MENTAL HEALTH: ACUTE PROBLEMS
  239  AMBULANCE-URGENT WITHIN 25 MINS
  239  MENTAL HEALTH: NON URGENT
  219  HOSPITAL ADMISSION - ON DAY
  169  EMERGENCY DEPT TRANSFER - ON DAY
  166  AMBULANCE EMERGENCY MULTILEG
  126  DR REQUESTING ATTENDANCE WITHIN 25 MIN

Non co-located RACF
    n  dispatch
12942  EMERGENCY DEPT TRANSFER - ON DAY
12604  AMBULANCE-URGENT WITHIN 25 MINS
 7329  REFERRAL MEDIUM ACUITY TIMEFRAME SPECIFIED
 6745  APPOINTMENT - PRE BOOKED
 5808  DR REQUESTING ATTENDANCE WITHIN 25 MIN
 5283  NO CONSENT-NOT URGENT/NO ASP 1-HOUR
 4421  REFERRAL MEDIUM ACUITY ONE HOUR
 3826  RENAL PATIENT TRANSPORT
 3692  AMBULANCE-CRITICAL
 2633  DR REQUESTING ATTENDANCE WITHIN 90 MINUTES
 2383  UNCONSCIOUS/FAINTING, NOT ALERT
 1901  NEPT 000 EVENT TO ERTCOMM
 1366  CLINICIAN-ACTIONED MEDIUM ACUITY EVENT
 1362  BREATHING PROBLEMS: DIFF SPEAKING B/W BREATHS
 1081  UNCONSCIOUS/FAINTING, UNCONSCIOUS - EFFECTIVE BREATHING
  874  APPOINTMENT - ON DAY
  861  CHEST PAIN: BREATHING NORMALLY >35
  824  NEPT BOOKED EVENT TO ERTCOMM
  787  HOSPITAL ADMISSION - PRE BOOKED
  768  CHEST PAIN: DIFF SPEAKING B/W BREATHS
  762  BREATHING PROBLEMS: NOT ALERT
  699  HOSPITAL ADMISSION - ON DAY
  671  CHEST PAIN: CLAMMY
  639  REFERRAL LOW ACUITY TIMEFRAME SPECIFIED
  622  FALLS, POSSIBLY DANGEROUS BODY AREA (ON THE GROUND OR FLOOR)

Figure 7: An ambulance transfer distance distribution for emergency and scheduled cases in the co-located RACF.

The transfer distance for co-located RACFs would be expected to be, on average, shorter than that of the other RACFs. However, Figure 7 shows that, on average, the transfer distance for co-located RACFs is higher. This raises further questions as to why this is the case.

Figure 8: The co-located RACF transfer network.
Figure 9: The location of the RACF with the co-located RACF highlighted.

Figure 8 shows that many patients were transferred from regional areas to the Melbourne area. However, this is hard to visualise on the map because the Melbourne area is smaller than the regional area, so many points inside Melbourne are squished together. One solution is to group the hospitals within Melbourne together by their emergency capabilities.

Figure 10: The co-located RACF transfer network with Melbourne hospital grouped.
Figure 11: The co-located RACF transfer network with Melbourne hospital grouped on the non-spatial layout.

Transfer Distance

To further analyse the common reasons for patient transfers from co-located RACFs (an RACF next to a hospital), the transfer distance is grouped into the following categories:

  • Zero distance (RACF and Hospital that is co-located)

  • Between 0 and 10km (Short-distance)

  • Between 10km and 50km (Medium-distance)

  • Above 50km (Long-distance)
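
The grouping above can be sketched with cut() followed by a count per group. The distances and the pairing with dispatch labels below are made up for illustration. How the boundaries are handled (a transfer of exactly 10 km counted as short, exactly 50 km as medium) is an assumption, since the report does not specify it.

```r
library(dplyr)

# Hypothetical toy transfers: distances in km with a dispatch reason
transfers <- tibble(
  distance_km = c(0, 0, 4.2, 10, 27.5, 50, 120),
  dispatch = c(
    "AMBULANCE-CRITICAL", "AMBULANCE-CRITICAL",
    "AMBULANCE-URGENT WITHIN 25 MINS", "AMBULANCE-URGENT WITHIN 25 MINS",
    "MENTAL HEALTH: NON URGENT", "MENTAL HEALTH: NON URGENT",
    "MENTAL HEALTH: ACUTE PROBLEMS"
  )
)

# Right-closed intervals: (-Inf, 0], (0, 10], (10, 50], (50, Inf)
transfers <- transfers |>
  mutate(distance_group = cut(
    distance_km,
    breaks = c(-Inf, 0, 10, 50, Inf),
    labels = c("Zero", "Short", "Medium", "Long")
  ))

# Dispatch reasons per distance group, most frequent first
dispatch_counts <- transfers |>
  count(distance_group, dispatch, sort = TRUE)

print(dispatch_counts)
```

Swapping dispatch for a diagnosis column in the count() call would give the corresponding diagnosis summary.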

Dispatch Reason

Zero Distance
    n  dispatch
   18  AMBULANCE-URGENT WITHIN 25 MINS
   10  AMBULANCE-CRITICAL
    9  DR REQUESTING ATTENDANCE WITHIN 25 MIN
    7  MENTAL HEALTH: ACUTE PROBLEMS

Short Distance
    n  dispatch
   88  AMBULANCE-URGENT WITHIN 25 MINS
   72  DR REQUESTING ATTENDANCE WITHIN 25 MIN
   51  CLINICIAN-ACTIONED MEDIUM ACUITY EVENT
   47  EMERGENCY DEPT TRANSFER - ON DAY
   31  NO CONSENT-NOT URGENT/NO ASP 1-HOUR
   30  AMBULANCE-CRITICAL
   21  UNCONSCIOUS/FAINTING, NOT ALERT
   19  REFERRAL MEDIUM ACUITY TIMEFRAME SPECIFIED
   14  REFERRAL MEDIUM ACUITY ONE HOUR

Medium Distance
    n  dispatch
  116  MENTAL HEALTH: NON URGENT
   95  MENTAL HEALTH: ACUTE PROBLEMS
   80  AMBULANCE EMERGENCY MULTILEG
   78  AMBULANCE-URGENT WITHIN 25 MINS
   61  EMERGENCY DEPT TRANSFER - ON DAY
   53  CLINICIAN-ACTIONED MEDIUM ACUITY EVENT
   50  NEPT BOOKED EVENT TO ERTCOMM

Long Distance
    n  dispatch
  171  CLINICIAN-ACTIONED MEDIUM ACUITY EVENT
  145  MENTAL HEALTH: ACUTE PROBLEMS
  110  MENTAL HEALTH: NON URGENT
   85  AMBULANCE EMERGENCY MULTILEG
   57  CLINICIAN-ACTIONED LOW ACUITY EVENT
   57  EMERGENCY DEPT TRANSFER - ON DAY
   55  AMBULANCE-URGENT WITHIN 25 MINS
   42  NON EMERGENCY MULTILEG
   39  NEPT BOOKED EVENT TO ERTCOMM

Diagnosis Reason

Zero Distance
    n  diagnosis
   12  PAIN
   10  FRACTURE/S
   10  OTHER - SPECIFY
    9  NO PROBLEM IDENTIFIED
    9  PSYCHIATRIC EPISODE

Short Distance
    n  diagnosis
   78  PAIN
   52  OTHER - SPECIFY
   30  SEPSIS
   26  FEBRILE
   21  ALTERED CONSCIOUS STATE
   20  FRACTURE/S
   18  INFECTION - OTHER / NOT LISTED
   18  SHORT OF BREATH

Medium Distance
    n  diagnosis
  174  PSYCHIATRIC EPISODE
   84  OTHER - SPECIFY
   81  PAIN
   62  SUICIDAL IDEATION
   49  FRACTURE/S
   35  UNKNOWN PROBLEM
   29  NO PROBLEM IDENTIFIED
   19  STROKE

Long Distance
    n  diagnosis
  175  PSYCHIATRIC EPISODE
  138  OTHER - SPECIFY
   90  PAIN
   73  SUICIDAL IDEATION
   59  FRACTURE/S
   48  ACUTE CORONARY SYNDROME
   34  UNKNOWN PROBLEM
   29  INFECTION - OTHER / NOT LISTED
   23  NO PROBLEM IDENTIFIED
   21  STROKE

From the tables, the dispatch and diagnosis reasons point to the same pattern: shorter-distance transfers tend to involve injuries or conditions that require quick transfer, while medium- to long-distance transfers tend to involve mental health issues. This is consistent with the expectation that emergency transfers requiring urgent attention should be completed quickly.

Figure 12: Histogram of ambulance transfer distances for emergency and scheduled cases, excluding zero-distance transfers.

Progress Update

Data cleaning and subsetting

  • Explored multiple network analysis packages, including tidygraph, igraph, and network, for comparison.

  • Applied these packages to the Caribou dataset, focusing on temporal data gaps that affect total-distance-travelled calculations.

  • Considering imputation strategies for temporal network data to address missingness.
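To illustrate why temporal gaps matter for total distance travelled, the following is a minimal sketch, not the analysis carried out on the Caribou dataset. It assumes each record is a hypothetical (timestamp, latitude, longitude) fix, and it uses a hypothetical `max_gap` threshold to separate distance between closely spaced fixes from distance that merely bridges a gap with a straight line. Python's standard library is used only to keep the example self-contained; the function names are illustrative.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def distance_with_gaps(fixes, max_gap):
    """Split total distance into observed and gap-bridging parts.

    fixes   : list of (timestamp, lat, lon) tuples, sorted by timestamp
    max_gap : timedelta; longer intervals between fixes count as gaps
    """
    observed_km = 0.0  # distance between fixes close together in time
    gap_km = 0.0       # straight-line distance across temporal gaps
    n_gaps = 0
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(fixes, fixes[1:]):
        step = haversine_km(lat0, lon0, lat1, lon1)
        if t1 - t0 > max_gap:
            gap_km += step
            n_gaps += 1
        else:
            observed_km += step
    return {"observed_km": observed_km, "gap_km": gap_km, "n_gaps": n_gaps}


# Hypothetical track: the third fix arrives nine hours after the second.
start = datetime(2020, 1, 1)
track = [
    (start, 0.0, 0.0),
    (start + timedelta(hours=1), 1.0, 0.0),
    (start + timedelta(hours=10), 2.0, 0.0),
]
summary = distance_with_gaps(track, max_gap=timedelta(hours=3))
print(summary)
```

Reporting the two components separately makes it visible how much of a total depends on gap-bridging segments, which is where an imputation strategy would have the most influence.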

Data visualisation

  • Experimented with crosstalk to create linked, side-by-side visualisations combining a network view with simple descriptive plots.

  • Constructed a supervisor-student relationship network to understand network theory.

  • Explored ggiraph for interactive network visualisations.

Exploratory data analysis

  • Preliminary analysis of ambulance transfer efficiency.

Project 2: Dynamic Infectious Disease Modelling using a Generalised Ambulance Model

Part A: Generalised Ambulance Model

This project focuses on understanding how changes in the ambulance transfer network affect the spread of infectious diseases, particularly among older populations in residential aged care facilities. Ambulance transfers create a pathway through which infections can be transmitted between facilities and hospitals (Gruber et al. 2013). Variations in transfer volume, patterns, or constraints can alter the structure of the network and, in turn, influence outbreak dynamics. There are a couple

Part B: Dynamic Infectious Disease Modelling

Infectious disease dynamics describe how a disease spreads and evolves within a population over time.
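One common way to formalise such dynamics is a compartmental model. The following is a minimal sketch of the standard susceptible-infectious-recovered (SIR) model in a single well-mixed population, integrated with simple Euler steps. It is an illustration only and not necessarily the model this project will adopt: it ignores the transfer network entirely, and the transmission rate `beta`, recovery rate `gamma`, and population sizes are assumed values chosen for demonstration.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Deterministic SIR model with Euler integration.

    s0, i0, r0 : initial susceptible, infectious, and recovered counts
    beta       : transmission rate per day (assumed value)
    gamma      : recovery rate per day (assumed value)
    days       : number of days to simulate
    dt         : Euler step size in days
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Illustrative run: 1,000 people, 10 initially infectious.
history = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, days=60)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print(f"Infections peak around day {peak_day}")
```

Extending this idea to a transfer network would typically mean keeping one set of compartments per facility and moving individuals between facilities along the observed transfers, which is where changes in network structure would enter the model.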

Progress Update

  • Tried statistical network modelling with ergm, and intergraph for converting between igraph and network objects.

References

Ben-Eliezer, Omri, Talya Eden, Joel Oren, and Dimitris Fotakis. 2022. “Sampling Multiple Nodes in Large Networks: Beyond Random Walks.” In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, 37–47.
Botan, Vanessa, Graham Law, Despina Laparidou, Viet-Hai Phung, Ffion Curtis, Gregory Whitley, Joseph Akanuwe, et al. 2023. “PP63 Variations in the Number of Ambulance Attendances to Care Homes Before and During Covid-19 Pandemic: An Interrupted Time Series Analysis.” BMJ Publishing Group Ltd; the British Association for Accident ….
Butts, Carter T. 2008. “Network: A Package for Managing Relational Data in R.” Journal of Statistical Software 24 (2). https://doi.org/10.18637/jss.v024.i02.
Cardenas, Nicolas Cespedes, Kimberly VanderWaal, Flávio Pereira Veloso, Jason Onell Ardila Galvis, Marcos Amaku, and Jose HH Grisi-Filho. 2021. “Spatio-Temporal Network Analysis of Pig Trade to Inform the Design of Risk-Based Disease Surveillance.” Preventive Veterinary Medicine 189: 105314.
Chen, Albert Y, Tsung-Yu Lu, Matthew Huei-Ming Ma, and Wei-Zen Sun. 2015. “Demand Forecast Using Data Analytics for the Preallocation of Ambulances.” IEEE Journal of Biomedical and Health Informatics 20 (4): 1178–87.
Chuong Nguyen, Quoc. 2025. “Network Sampling: An Overview and Comparative Analysis.” arXiv e-Prints, arXiv–2504.
Csárdi, Gábor, Tamás Nepusz, Vincent Traag, Szabolcs Horvát, Fabio Zanini, Daniel Noom, and Kirill Müller. 2026. igraph: Network Analysis and Visualization in R. https://doi.org/10.5281/zenodo.7682609.
Fernández-Gracia, Juan, Jukka-Pekka Onnela, Michael L Barnett, Víctor M Eguíluz, and Nicholas A Christakis. 2017. “Influence of a Patient Transfer Network of US Inpatient Facilities on the Incidence of Nosocomial Infections.” Scientific Reports 7 (1): 2930.
Gohel, David, and Panagiotis Skintzos. 2025. Ggiraph: Make ’Ggplot2’ Graphics Interactive. https://doi.org/10.32614/CRAN.package.ggiraph.
Gruber, Isabella, Ursel Heudorf, Guido Werner, Yvonne Pfeifer, Can Imirzalioglu, Hanns Ackermann, Christian Brandt, Silke Besier, and Thomas A Wichelhaus. 2013. “Multidrug-Resistant Bacteria in Geriatric Clinics, Nursing Homes, and Ambulant Care–Prevalence and Risk Factors.” International Journal of Medical Microbiology 303 (8): 405–9.
Harmsen, AMK, Georgios F Giannakopoulos, Patrick R Moerbeek, Elise P Jansma, HJ Bonjer, and Frank W Bloemers. 2015. “The Influence of Prehospital Time on Trauma Patients Outcome: A Systematic Review.” Injury 46 (4): 602–9.
Harris, Anthony, and Anurag Sharma. 2018. “Estimating the Future Health and Aged Care Expenditure in Australia with Changes in Morbidity.” PloS One 13 (8): e0201697.
Hu, Pili, and Wing Cheong Lau. 2013. “A Survey and Taxonomy of Graph Sampling.” arXiv Preprint arXiv:1308.5865.
Jiao, Bo. 2024. “Sampling Unknown Large Networks Restricted by Low Sampling Rates.” Scientific Reports 14 (1): 13340.
Kamada, Tomihisa, and Satoru Kawai. 1989. “An Algorithm for Drawing General Undirected Graphs.” Information Processing Letters 31 (1): 7–15. https://doi.org/10.1016/0020-0190(89)90102-6.
Kearney, Anne R, and Daniel Winterbottom. 2006. “Nearby Nature and Long-Term Care Facility Residents: Benefits and Design Recommendations.” Journal of Housing for the Elderly 19 (3-4): 7–28.
Müller, Kirill, and Hadley Wickham. 2025. Tibble: Simple Data Frames. https://doi.org/10.32614/CRAN.package.tibble.
Nair, Shruti Premshankar, Ashley L Quigley, Aye Moa, Abrar Ahmad Chughtai, and Chandini Raina Macintyre. 2023. “Monitoring the Burden of COVID-19 and Impact of Hospital Transfer Policies on Australian Aged-Care Residents in Residential Aged-Care Facilities in 2020.” BMC Geriatrics 23 (1): 507.
Parohan, Mohammad, Sajad Yaghoubi, Asal Seraji, Mohammad Hassan Javanbakht, Payam Sarraf, and Mahmoud Djalali. 2020. “Risk Factors for Mortality in Patients with Coronavirus Disease 2019 (COVID-19) Infection: A Systematic Review and Meta-Analysis of Observational Studies.” The Aging Male 23 (5): 1416–24.
Pebesma, Edzer. 2018. “Simple Features for R: Standardized Support for Spatial Vector Data.” The R Journal 10 (1): 439–46. https://doi.org/10.32614/RJ-2018-009.
Pedersen, Thomas Lin. 2024a. Ggraph: An Implementation of Grammar of Graphics for Graphs and Networks. https://doi.org/10.32614/CRAN.package.ggraph.
———. 2024b. Tidygraph: A Tidy API for Graph Manipulation. https://CRAN.R-project.org/package=tidygraph.
Rao, K Venkateswara, A Govardhan, and KV Chalapati Rao. 2012. “Spatiotemporal Data Mining: Issues, Tasks and Applications.” International Journal of Computer Science and Engineering Survey 3 (1): 39.
van der Meer, Lucas, Lorena Abad, Andrea Gilardi, and Robin Lovelace. 2024. Sfnetworks: Tidy Geospatial Networks. https://luukvdmeer.github.io/sfnetworks/.
Wang, Earo, Dianne Cook, and Rob J Hyndman. 2020. “A New Tidy Data Structure to Support Exploration and Modeling of Temporal Data.” Journal of Computational and Graphical Statistics 29 (3): 466–78. https://doi.org/10.1080/10618600.2019.1695624.
Wickham, Hadley. 2014. “Tidy Data.” Journal of Statistical Software 59 (10): 1–23. https://doi.org/10.18637/jss.v059.i10.
———. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. Dplyr: A Grammar of Data Manipulation. https://doi.org/10.32614/CRAN.package.dplyr.
Wyer, Leanna, Yair Guterman, Vivian Ewa, Eddy Lang, Peter Faris, and Jayna Holroyd-Leduc. 2024. “The Impact of the COVID-19 Pandemic on Transfers Between Long-Term Care and Emergency Departments Across Alberta.” BMC Emergency Medicine 24 (1): 9.